feat(Query): add query complexity framework#376
feat(Query): add query complexity framework#376kim-em wants to merge 13 commits intoleanprover:mainfrom
Conversation
a889930 to
5299afa
Compare
|
I have intentionally kept this at the level of "single tick" counting of costs. It is a fairly straightforward change to make this more flexible, which I have not done here to ease reviewing of the essential idea of using monad parametricity. |
| namespace Cslib.Query | ||
|
|
||
| structure TickT.State where | ||
| count : Nat |
There was a problem hiding this comment.
I would like to make this private, but this breaks a proof below in a way I haven't yet understood.
There was a problem hiding this comment.
I think you reported this issue to me in Dec (I hope it's the same one). I handed over the following reproducer to @Kha:
module
public structure State where
private count : Nat
private example {P : State → Prop} (h : P ⟨State.count s⟩) :
P ⟨State.count s⟩ := by
set_option trace.Meta.Tactic.simp true in
set_option trace.Debug.Meta.Tactic.simp true in
set_option trace.Debug.Meta.Tactic.simp.congr true in
set_option trace.Meta.realizeConst true in
simp [*]
private example {P : State → Prop} (h : P ⟨State.count s⟩) :
P ⟨State.count s⟩ := by
omega
private example {P : State → Prop} (h : P ⟨State.count s⟩) :
P ⟨State.count s⟩ := by
grindwe traced the crash down to mkCongrSimp?, IIRC it was due to getFunInfo. But I think we never fixed it; should have kept better track of this.
There was a problem hiding this comment.
Actually no, it appears this issue has been fixed! At least the reproducer works, and the following works as well (but proofs are broken on latest Mathlib due to DefEq changes...)
module
public import Std.Do.Triple.Basic
import Std.Tactic.Do
open Std.Do
public section
structure TickM.State where
private count : Nat
@[expose] def TickM (α : Type) := StateM TickM.State α
namespace TickM
instance : Monad TickM := inferInstanceAs (Monad (StateM TickM.State))
instance : LawfulMonad TickM := inferInstanceAs (LawfulMonad (StateM TickM.State))
instance : Std.Do.WP TickM (.arg TickM.State .pure) := inferInstanceAs (Std.Do.WP (StateM TickM.State) _)
def tick : TickM Unit := fun s => ⟨(), ⟨s.count + 1⟩⟩
@[spec]
private theorem tick_spec {Q : PostCond Unit (.arg TickM.State .pure)} :
Triple tick (fun s => Q.1 () ⟨s.count+1⟩) Q := by
simp [tick, Triple, wp, Id.run]So maybe try again to make the field private?
61d9a60 to
6b6a308
Compare
| @[expose] def TickT (m : Type → Type) (α : Type) := StateT TickT.State m α | ||
|
|
||
| /-- The tick-counting monad, specializing `TickT` to `Id`. -/ | ||
| @[expose] def TickM (α : Type) := TickT Id α |
There was a problem hiding this comment.
This is effectively identical to TimeM Nat
There was a problem hiding this comment.
I've now integrated / replaced the existing material using TimeM (renaming everything in this PR from Tick back to Time).
In particular, in this last commit I've added all the theorems about mergeSort's spec and runtime bounds using this framework.
sgraf812
left a comment
There was a problem hiding this comment.
Nice! It would be great if we could prove specs about Costs using mvcgen, though. Can take a look once I'm back from PTO
| namespace Cslib.Query | ||
|
|
||
| structure TickT.State where | ||
| count : Nat |
There was a problem hiding this comment.
I think you reported this issue to me in Dec (I hope it's the same one). I handed over the following reproducer to @Kha:
module
public structure State where
private count : Nat
private example {P : State → Prop} (h : P ⟨State.count s⟩) :
P ⟨State.count s⟩ := by
set_option trace.Meta.Tactic.simp true in
set_option trace.Debug.Meta.Tactic.simp true in
set_option trace.Debug.Meta.Tactic.simp.congr true in
set_option trace.Meta.realizeConst true in
simp [*]
private example {P : State → Prop} (h : P ⟨State.count s⟩) :
P ⟨State.count s⟩ := by
omega
private example {P : State → Prop} (h : P ⟨State.count s⟩) :
P ⟨State.count s⟩ := by
grindwe traced the crash down to mkCongrSimp?, IIRC it was due to getFunInfo. But I think we never fixed it; should have kept better track of this.
| count : Nat | ||
|
|
||
| /-- A monad transformer that adds tick-counting to any monad `m`. -/ | ||
| @[expose] def TickT (m : Type → Type) (α : Type) := StateT TickT.State m α |
There was a problem hiding this comment.
From my past (limited) experience with defeq abuse, it may make sense to make TickT @[irreducible] and add an explicit injection/projection pair.
| /-- Run a `TickT` computation, starting with tick count 0, | ||
| returning the result and the final tick count. -/ | ||
| def run [Monad m] (x : TickT m α) : m (α × Nat) := do | ||
| let (a, s) ← StateT.run x ⟨0⟩ |
There was a problem hiding this comment.
(defeq abuse, but probably fine if you formulate your lemmas about .run.)
|
|
||
| /-- Instrument a pure function as a tick-counted query. | ||
| `counted f a` increments the tick counter by 1 and returns `f a`. -/ | ||
| @[expose] def counted [Monad m] (f : α → β) (a : α) : TickT m β := do tick; pure (f a) |
There was a problem hiding this comment.
This function suggests the existence of counted2 [Monad m] (f : α → β → γ) (a : α) (b : β) : TickT m γ etc., similar to liftA<n> in Haskell. It's probably convenient to have counted, but I think it would be convenient to have counted2 and counted3 as well.
Although it would be reasonable for callers to just uncurry their functions before using counted. I think it doesn't scale if we also provide variants of RunsInT below.
There was a problem hiding this comment.
I think I'd prefer to leave the uncurried API for later, to reduce surface area here.
| /-- `RunsInT n f bound` asserts that when the monad-generic function `f` | ||
| is specialized to `TickT n`, with any query that calls `tick` at most once per invocation, | ||
| the total number of ticks is bounded by `bound x`. | ||
|
|
||
| The function `f` is generic over monads that extend `n` via `MonadLift`, | ||
| ensuring it cannot observe the tick instrumentation. -/ | ||
| @[expose] def RunsInT {n : Type → Type} {ps : PostShape} [Monad n] [WP n ps] | ||
| (f : ∀ {m : Type → Type} [Monad m] [MonadLiftT n m], (α → m β) → γ → m δ) | ||
| (bound : γ → Nat) : Prop := | ||
| ∀ (query : α → TickT n β), (∀ a, TickT.Costs (query a) 1) → | ||
| ∀ x, TickT.Costs (f query x) (bound x) |
There was a problem hiding this comment.
Neat!!
- I think you need to add somewhere (maybe to the top-level comment) that relying on parametricity is only credible when everything is computable. Otherwise you can have
if h : α = Nat then ... else ...byClassical.choice. - I wonder if callers can provide an
fthat uses higher-order constructs such asforloops indonotation. Can't quite play it out in my head right now, but I think it's one of the reasons that theIteratorLoopclass is so complicated.
There was a problem hiding this comment.
Done in 3737e39 — added a "Computability caveat" section to the RunsIn module doc-string, noting that a noncomputable algorithm could use Classical.choice to subvert instrumentation, and that this framework targets upper bounds, not lower bounds.
| induction xs with | ||
| | nil => exact TickT.Costs.pure () | ||
| | cons x xs ih => | ||
| simp only [List.length]; rw [Nat.add_comm] | ||
| have ih : TickT.Costs (mapSum query xs) xs.length := ih | ||
| exact TickT.Costs.bind (hquery x) (fun y => by | ||
| have := TickT.Costs.bind | ||
| (TickT.Costs.monadLift (modify (· + y) : StateM Int Unit) (fun P => by mvcgen)) | ||
| (fun _ => ih) | ||
| rwa [Nat.zero_add] at this) |
There was a problem hiding this comment.
I guess it would be nice to use mvcgen for this kind of proof. Can take a look when I'm back from PTO! You shouldn't need to replicate your own reasoning framework with Costs.pure/Costs.bind etc.
There was a problem hiding this comment.
Again, I tried to make something work here, but couldn't come up with anything that reduced the number of lines. :-)
| The predicate family `pre c` captures "the Int state is c" within the | ||
| abstract postcondition shape `ps`. The hypotheses `hf` and `h_modify` | ||
| assert that `f` preserves this predicate and the lifted `modify` transitions it. -/ |
There was a problem hiding this comment.
My gut says it yields simpler VCs if you write specs that are schematic in the post (like tick_spec) rather than the precondition
There was a problem hiding this comment.
I couldn't get anything useful out of this, but if you have time to investigate I'd be happy to hear.
| (h_modify : ∀ v c, ⦃pre c⦄ | ||
| (MonadLiftT.monadLift (modify (· + v) : StateM Int Unit) : m Unit) | ||
| ⦃⇓ _ => pre (c + v)⦄) | ||
| (xs : List Int) : | ||
| ∀ c, ⦃pre c⦄ mapSum f xs ⦃⇓ _ => pre (c + (xs.map g).sum)⦄ := by |
There was a problem hiding this comment.
This sounds to me like it could be expressed as a loop invariant lemma, similar to Spec.forIn_list etc.
|
@kim-em : This is duplicating #372 (refinement of #275) is it not? Also there I have a specific meaning of queries which are pre-declared inductive types there. I think the reuse of the word @sgraf812 : Related question, could we set up mvcgen for the framework in #372 ? Essentially it runs code in Id monad and measures complexity in |
I strongly disagree. You don't state your queries up front in your model. So you can't state reductions, lower bounds etc. all this makes use of explicit queries and custom cost functions. Instead you use a lean function as a parameter. This is a huge limitation in algorithms theory and complexity. In the model of #372, one can formally state multiple RAM models and even DSLs for Turing machines and circuits from circuit complexity. Therefore we can already define uniform circuit classes in complexity which involves writing programs in two different models with two different cost models (TMs and circuits). The fact that I can write circuits tells me that I can also encode parallel algorithms and the equivalence between circuits and parallel algorithms (there are several standard equivalences between these models). I have kept my pr limited to simple examples to keep it focused (the maintainers explicitly asked this). In fact it can even talk about lower bounds. In summary : the other PR allows for a comprehensive top down treatment of the multitude of models in algorithms and the relationships between them(one of the standard models like RAM and TM could be sink nodes in this relationship network). What this PR is doing is equivalent to compressing the process of defining a query and an evaluation function (and a cost function) in my PR, by directly providing the interpreted version as a parameter (cmp). But it is losing a lot in the process. So, you get a slight generalisation from evaluating in
|
@Shreyas4991, you are misunderstanding the purpose of this PR, and how we achieve security against cost manipulation via parametricity.
#372 doesn't "exist" for the purposes of that suggestion, as it hasn't been merged. I suggest comparison between #376 and #372 stay on Zulip, rather than in the PR discussions. |
In this instance I am certain I do. I am the domain expert here as far as algorithms is concerned. This is merely short-circuiting my approach and shares the exact same limitation mine does, with write-queries, and with a less than optimal design for the purposes my PR serves. The "generalization" is merely trying to hide the evaluation behind an arbitrary monad instead of a free monad with an uninterpreted query. |
There is no suggestion in that guidelines that a contributing PRs work can be overriden without discussion if it isn't merged already. It is also a core principle of open source etiquette to discuss an idea first, especially if someone is already working on it before making one's own contribution. This ensures that the work and effort of contributors is not devalued or outright erased. I have adhered to this principle with my PR from the beginning via zulip discussions. This PR has violated it.
|
|
@Shreyas4991, you mentioned that "our models have the same security from misuse of return." I don't think that's the case. Here's a fully computable algorithm in the framework of #372 that "sorts in zero queries": /-- Sorting with zero queries. -/
def freeSort {α : Type} (le : α → α → Prop) [DecidableRel le]
(l : List α) : Prog (SortOpsInsertHead α) (List α) :=
.pure (l.insertionSort le)
theorem freeSort_correct {α : Type} (le : α → α → Prop) [DecidableRel le] (l : List α) :
(freeSort le l).eval (sortModel le) = l.insertionSort le := by
simp [freeSort]
theorem freeSort_free {α : Type} (le : α → α → Prop) [DecidableRel le] (l : List α) :
(freeSort le l).time (sortModel le) = 0 := by
simp [freeSort]These theorems together assert that sorting can be done correctly with zero comparisons. The root cause: In the monad-parametric approach of #376, the comparison is an opaque parameter To be clear about the limitations of #376: because Lean has classical logic, a The exploit above is not about classical logic — |
It's not an exploit because you are including an instance that the model explicitly does not require . You can do this in your model as well. Concretely take your decidable LE instance and lift, choose a specific monad with a liftM function and you can also lift You are inserting an extra typeclass instance into my def. I can conversely insert a special monad into yours and erase the parametricity. Neither of these are "exploits" If providing extra instances is fair game then so is specializing the monad parameter to yours. And to be clear, based on my discussion with type theorists, adding proper foolproof parametric polymorhism to a dependently typed system is an open problem (concretely something like ML style modules). Rocq's module system has been mentioned to me as an example (and they just uncovered a soundness bug there yesterday). So if you solve parametricity in lean, please submit it to POPL. |
|
@kim-em to be clear, a Prog should never have typeclass instances. That's the easiest red flag to catch. please see the actual linearSearch example. It explicitly avoids giving Secondly you made a claim about your goals. They are not my goals. The query model is a model that I am building based on what I know will work for algorithms theory. I am working on something that I can use for algorithms theory. The code verification stuff can be handled by Boole and is not my issue. It's arguably better suited for the problem. Thirdly this PR is just reinventing WriterT monad transformer |
|
Lastly parametric polymorphism requires that I should not be able to supply an explicit |
|
@kim-em : Update, I just changed the Prop model to a Bool le model for my sorting query and algorithms in cslib#372 . So no But as I have mentioned before, nothing stops someone from writing So this "exploit" you have found is equivalently applicable in both models (or more clearly, not), since any monadic DSL will allow pure operations and bad instantiation of monads and other parameters (that's a typeclass vs module system problem). Human reviewing is the only guard. What both models do is substantially reduce the surface area for error. Also (and this is exactly why I think this PR is misguided from a contribution guidelines standpoint), you could have just made a PR to change my model to a Concretely, in a proper approach, you would have to make a PR on top of #372 :
Instead your PR lays claim to the idea of query complexity in the title, an idea that I have been promoting since even before the debate formalization, and then squanders the power of my model by losing its explicit inductive query types (while also depriving all of us authors the authorship credit that is due), which I have crafted to handle a huge chunk of algorithms theory literature in a top down fashion. |
You continue to misunderstand this PR. I'd prefer to leave this to reviewers at this point. I appreciate you don't like my implicit criticism of #372 here. |
I would prefer criticism that was accurate and constructive and that actually improved my PR. Seeing as this was neither, and also making false claims about the problem being inherent to a free monad approach, but so blatantly supplying an instance to a Prog, yes I object. I also object to claims that this PR is better than 372 when 372's objectives are misunderstood. |
There was a problem hiding this comment.
This PR should be deleting other models and their examples. These can actually be used.
| count : Nat | ||
|
|
||
| /-- A monad transformer that adds tick-counting to any monad `m`. -/ | ||
| @[expose] def TimeT (m : Type → Type) (α : Type) := StateT TimeT.State m α |
There was a problem hiding this comment.
This should be a writer monad. A time counting monad is an additive log.
Cslib/Algorithms/Lean/TimeM.lean
Outdated
There was a problem hiding this comment.
Again. This PR is deleting a file based on the presumption that it is no longer needed. I don't believe we have settled on deleting all other models in favour of this PR. My model builds on top of this PR. So this deletion rests on undecided assumptions and breaks other models.
There was a problem hiding this comment.
Yes, clearly the PRs are incompatible.
There was a problem hiding this comment.
They are separate models. They can exist in separate folders. The code itself can be reused elsewhere. Deleting it is breaking other PRs including mine without fixes. The current Zulip discussion has focussed on retaining more than one model. This PR ignores that by deleting an existing model before any decision has been reached.
|
@kim-em I guided GPT to find two hacks in this "parametricity" approach, with the same level of tricks as the aforementioned "hack" in mine. It took less than 5 minutes to type up the explanation to GPT and have it generate working code while I was simultaneously at a workshop understanding a different proof. I hope this clarifies that I actually understand parametricity better and we can do away these silly accusations that I misunderstand this PR. Especially since they were made when I pointed out how I'd reproduce your hacks in this PR and actually produced them. Parametric polymorphism in lean for algorithms was actually discussed in the lean discord in mid 2024. EDIT: The docstring claims are also too strong. You clearly don't need to use the provided |
Shreyas, obviously proposing to do something in one way does not give you a veto on attempts to do that thing other ways. Equally obviously, I am aware of your work, you have talked about it extensively. I am not particularly interested in having a long discussion with you about this: words are cheaper for you than they are for me! |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a general correctness theorem for mapSum parameterized by an abstract predicate family `pre : Int → Assertion ps`, which captures "the Int state is c" for any postcondition shape. Both mapSum_spec and mapSum_spec_tick are now derived as corollaries, removing the TODO comment about the desired generalization. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rt at Id Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce explicit `TickT.mk` and `TickT.unfold` to bridge between `TickT m α` and `StateT TickT.State m α`, with simp lemmas and ext. Rewrite Monad, MonadLift instances and `tick`/`run`/`run'` definitions to go through mk/unfold rather than relying on defeq. This makes the code more robust and easier to tighten reducibility settings later. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
A noncomputable algorithm could use Classical.choice to inspect the monad or query function and subvert tick instrumentation. Document that RunsIn theorems should only be proved about computable algorithms, and that this framework targets upper bounds, not lower bounds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Unify `TickT`/`TickM` with `TimeT`/`TimeM` per PR review, deleting the old product-monad `TimeM` and its `MergeSort` implementation. Replace with monad-generic `merge`/`mergeSort` in the Query framework, proved correct (`merge_perm`, `mergeSort_perm`, `id_run_merge`) and with complexity bounds (`merge_costs`, `mergeSort_runsIn`). Also fix all linter warnings in Basic.lean and InsertionSort/Lemmas.lean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Show that orderedInsert, insertionSort, merge, and mergeSort produce sorted (Pairwise r) output, both at m := Id and at m := TimeT n with a pure comparator. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Id defeq abuse Define `PureReturn` predicate for monadic functions with pure returns. Generalize sorted-result theorems from TimeT-specific to monad-generic, deriving _id and _timeT corollaries. Replace all `simp [Id.run, Pure.pure]` with `Id.run_pure`/`Id.run_bind` and wrap Id comparators in `pure`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
285f4ae to
7a39a7c
Compare
|
Please note that I have broken your parametericity based security claims with two hacks very similar to yours and established my earlier claim that the models are equivalent in security. I have provided the link to the examples. Not only have I disproved your claim that I misunderstand your pr. I have also established that you misunderstand parametric polymorphism, given
You say you value your words more. all I am seeing in this PR is false or no justifications for design decisions and a tonne of code. Code is cheap. Anybody can code. LLMs can code. Design is important. Design >>> code. Hence I talk before making big PRs which is the established OSS etiquette. I rest my case. I have established that this PR offers no more "security against hacks" than mine. I fully recognize that a lightweight framework will invoke tradeoffs between automated guards and reviewers checking definitions. I find this discussion increasingly toxic and I am forced to continue this to defend my work against incorrect and also not particularly relevant claims, among other things. |
This is not about vetoing. This about a lack of prior discussion about an overlapping PR similar to what I did before starting my PR. This is about exploring the possibility of building on top of existing approaches. This is basic OSS etiquette. Code is cheap. Design is expensive. This etiquette is very clear that words (discussions) about new and overlapping contributions are more valuable than just coding up a PR. This PR is a clear violation of basic OSS etiquette. |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ort bound - Restore copyright to Sorrachai Yingchareonthawornhcai in Basic.lean, MergeSort/Defs.lean, and MergeSort/Lemmas.lean (from deleted TimeM.lean and MergeSort/MergeSort.lean) - Add Eric Wieser as author on Basic.lean (from deleted TimeM.lean) - Update copyright year to 2026 for new files (InsertionSort/*, MonadicExample) - Change insertionSort_runsIn bound from `xs.length * xs.length` to `xs.length ^ 2` Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This PR adds infrastructure for proving upper bounds on the number of queries
(comparisons, oracle calls, etc.) an algorithm makes, using monad parametricity to ensure validity of the bounds.
TickT mmonad transformer with tick counting,Costspredicate, and combinators(
Costs.pure,Costs.bind,Costs.monadLift,Costs.ite, etc.)RunsIn/RunsInTpredicates packaging the parametricity argumentmapSumwith functional correctness under tick instrumentationQuoting from the module doc for
RunsIn:RunsIn f bound(and its generalizationRunsInT) assert that a monad-generic algorithmfmakes at most
bound xqueries on inputx.An algorithm like
is written generically over the monad
m. To measure its query complexity, we specializemtoTickM(orTickT nfor algorithms with additional effects) and provide acmpimplementation that callstickonce per invocation.Because
insertionSortis parametric inm, it cannot observe the tick instrumentation.It must call
cmpthe same number of times regardless of which monad it runs in.Therefore any upper bound proved via
TickMis a true bound on query count in all monads.🤖 Prepared with Claude Code